zero-shot model
- North America > United States > California (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.93)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Vision (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.46)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
- (2 more...)
Robust Fine-tuning of Zero-shot Models via Variance Reduction
When fine-tuning zero-shot models like CLIP, our desideratum is for the fine-tuned model to excel in both in-distribution (ID) and out-of-distribution (OOD). Recently, ensemble-based models (ESM) have been shown to offer significant robustness improvement, while preserving high ID accuracy. However, our study finds that ESMs do not solve the ID-OOD trade-offs: they achieve peak performance for ID and OOD accuracy at different mixing coefficients. When optimized for OOD accuracy, the ensemble model exhibits a noticeable decline in ID accuracy, and vice versa. In contrast, we propose a sample-wise ensembling technique that can simultaneously attain the best ID and OOD accuracy without the trade-offs.
Rethinking Plant Disease Diagnosis: Bridging the Academic-Practical Gap with Vision Transformers and Zero-Shot Learning
Benabbas, Wassim, Brahimi, Mohammed, Akhrouf, Samir, Fortas, Bilal
Recent advances in deep learning have enabled significant progress in plant disease classification using leaf images. Much of the existing research in this field has relied on the PlantVillage dataset, which consists of well-centered plant images captured against uniform, uncluttered backgrounds. Although models trained on this dataset achieve high accuracy, they often fail to generalize to real-world field images, such as those submitted by farmers to plant diagnostic systems. This has created a significant gap between published studies and practical application requirements, highlighting the necessity of investigating and addressing this issue. In this study, we investigate whether attention-based architectures and zero-shot learning approaches can bridge the gap between curated academic datasets and real-world agricultural conditions in plant disease classification. We evaluate three model categories: Convolutional Neural Networks (CNNs), Vision Transformers, and Contrastive Language-Image Pre-training (CLIP)-based zero-shot models. While CNNs exhibit limited robustness under domain shift, Vision Transformers demonstrate stronger generalization by capturing global contextual features. Most notably, CLIP models classify diseases directly from natural language descriptions without any task-specific training, offering strong adaptability and interpretability. These findings highlight the potential of zero-shot learning as a practical and scalable domain adaptation strategy for plant health diagnosis in diverse field environments.
- South America > Peru (0.05)
- Africa > Middle East > Algeria > M'Sila Province > M'Sila (0.04)
- Africa > Middle East > Algeria > Bordj Bou Arreridj Province > Bordj Bou Arreridj (0.04)
- (2 more...)
- Health & Medicine (1.00)
- Food & Agriculture > Agriculture (1.00)
Xihe: Scalable Zero-Shot Time Series Learner Via Hierarchical Interleaved Block Attention
Sun, Yinbo, Fang, Yuchen, Zhu, Zhibo, Li, Jia, Liu, Yu, Deng, Qiwen, Zhou, Jun, Yu, Hang, Lu, Xingyu, Ma, Lintao
The rapid advancement of time series foundation models (TSFMs) has been propelled by migrating architectures from language models. While existing TSFMs demonstrate impressive performance, their direct adoption of cross-domain architectures constrains effective capture of multiscale temporal dependencies inherent to time series data. This limitation becomes particularly pronounced during zero-shot transfer across datasets with divergent underlying patterns and sampling strategies. To address these challenges, we propose Hierarchical Interleaved Block Attention (HIBA) which employs hierarchical inter- and intra-block sparse attention to effectively capture multi-scale dependencies. Intra-block attention facilitates local information exchange, and inter-block attention operates across blocks to capture global temporal pattern interaction and dynamic evolution. Leveraging the HIBA architecture, we introduce Xihe, a scalable TSFM family spanning from an ultra-efficient 9.5M parameter configuration to high-capacity 1.5B variant. Evaluated on the comprehensive GIFT-Eval benchmark, our most compact Xihe-tiny model (9.5M) surpasses the majority of contemporary TSFMs, demonstrating remarkable parameter efficiency. More impressively, Xihe-max (1.5B) establishes new state-of-the-art zero-shot performance, surpassing previous best results by a substantial margin. This consistent performance excellence across the entire parameter spectrum provides compelling evidence for the exceptional generalization capabilities and architectural superiority of HIBA.
- North America > United States > California (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.93)
- Asia > Singapore (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.67)
- Europe > Switzerland > Zürich > Zürich (0.14)
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
- (2 more...)